Simple formulae are often used to estimate the sensitivity of coded mask X-ray or gamma-ray telescopes, but
these are strictly only applicable if a number of basic assumptions are met. Complications arise, for example, 
if a grid structure is used to support the mask elements, if the detector spatial resolution is not good enough 
to completely resolve all the detail in the shadow of the mask or if any of a number of other simplifying 
conditions are not fulfilled. We derive more general expressions for the Poisson-noise-limited sensitivity of
astronomical telescopes using the coded mask technique, noting explicitly in what circumstances they are 
applicable. The emphasis is on using nomenclature and techniques that result in simple and revealing results. 
Where no convenient expression is available a procedure is given which allows the calculation of the sensitivity. 
We consider certain aspects of the optimisation of the design of a coded mask telescope and show that when 
the detector spatial resolution and the mask to detector separation are fixed, the best source location accuracy 
is obtained when the mask elements are equal in size to the detector pixels. © 2007 Optical Society of America 
OCIS codes: 340.7430, 100.1830, 110.4280. 


1. Introduction 

Coded mask telescopes have been widely used in X-ray 
and gamma-ray astronomy, particularly at those ener- 
gies where other imaging techniques are not available or 
where the wide field of view possible with the technique 
is important. Recent examples of astronomical applica- 
tions of the technique include the BAT instrument on the 
SWIFT spacecraft [1] and three of the instruments on 
INTEGRAL [2]; other examples are described in [3-6]. 
The technique is based on recording the shadow of a 
mask containing both transparent and opaque regions in 
a pattern that allows an image of the source of the radia- 
tion to be reconstructed. Coded mask imaging has been 
reviewed by Caroli et al. [7], with more recent discussion 
of some aspects of the technique by Skinner [8].

Many possible mask patterns have been discussed. The 
mask may simply contain randomly placed holes [9,10] 
or it may be based on geometric patterns [11, 12]; indeed
almost any design can be used without losing the
imaging capability [8]. Most work has been based on patterns
comprising holes placed on a regular rectangular or
hexagonal grid according to some algorithm. Discussion 
of the choice of algorithm for placing the holes has con- 
centrated on designs in which the (cyclic) autocorrela- 
tion function of the pattern, sampled at shifts corre- 
sponding to a whole number of cells of the grid, is bi- 
valued with a central peak and flat wings. An extensive 
literature [13-28] exists on arrays which have this property,
which are usually termed 'Uniformly Redundant
Arrays' (URAs). For URA-based masks, in certain well
defined circumstances, cross-correlation of the recorded
data with an array which corresponds to the mask pattern
(with a scaling and offset applied) leads to images
with a point source response function (PSF) having a
central peak and perfectly flat sidelobes. As image
reconstruction by cross-correlation can be shown (again
in specific circumstances, to be discussed below) also to
yield the best possible signal-to-noise ratio, such solutions
have attracted widespread attention.

Variants of URAs have been proposed (e.g. Modified 
Uniformly Redundant Arrays, MURAs, [29]; see also 
[30]) in which the same ideal PSF is obtained when the 
reconstructing array differs marginally from the coding 
pattern. Provided the number of elements is large, the 
signal-to-noise ratio is essentially the same as for URAs.

URAs (and MURAs, etc) provide a mathematically 
satisfying solution to the problem of mask design, but 
their advantages in practice are less evident. The circumstances
in which the ideal response is obtained relate
to the cyclic nature of the patterns. The PSF is free
from spurious responses ('ghosts' or 'sidelobes') only if
the shadow recorded is always of a whole number of cycles
of a repeating pattern. It can be arranged that this
condition is met over a limited field (the so-called 'fully
coded field of view', FCFOV) but for sources outside
this region (in the 'partially coded field of view', PCFOV)
the shadow of the edge of the mask will appear in
the recorded data. Sources in the PCFOV can produce
spurious responses within the FCFOV and vice versa. It
is possible to block the flux from sources in the PCFOV
with a collimator consisting of slats or tubes, but as
pointed out in [31] this not only narrows the observable
field but results in attenuation of the recorded signal
even within the FCFOV. Thus it is often the case that
random mask patterns are as good as any other and in- 
deed a random pattern was selected for the mask of the 
very successful BAT instrument on the SWIFT satel- 
lite [32]. 

The ghost images and other imaging artifacts that 
arise from partial coding, from non-uniform background, 
or from other effects present in real instruments can be
alleviated provided one has a good understanding of the 
instrument and adequate computing resources. Although 
for sufficiently long observations or combinations of ob- 
servations, these systematic errors will inevitably become 
important, many methods are available to reduce them 
[33-43]. Typically these involve fitting and subtracting
bright sources or otherwise taking them into account, 
and carefully modeling background non-uniformities. 

Even with advanced image reconstruction techniques, 
Poisson (or ‘photon’) noise due to limited counting 
statistics leads to random errors, placing a limit on the 
sensitivity. This limit is particularly important for very 
short observations of relatively bright sources, as is the 
case in the detection of gamma-ray bursts, for example. 
However even for long term survey observations it places 
an intrinsic limit on the sensitivity that can be achieved, 
however sophisticated the analysis technique. 

We here consider the calculation of the statistical limit 
to the significance with which a point source can be ob- 
served in the presence of Poisson noise on both the flux 
from the source and the detector background. We pay 
particular attention to the assumptions that are made 
in the derivation of the formulae and the circumstances 
in which they are valid. In section 2 we list the assump- 
tions that have been made, explicitly or implicitly, in 
many previous approaches to this problem. In successive 
sections we attempt to provide useful expressions for the 
signal-to-noise ratio where subsets of these assumptions
do not hold. As mentioned above, we concentrate on
the Poisson-noise-limited sensitivity of the instrument.
Effectively we consider observations short enough that 
systematic errors are well below the limit imposed by 
statistics. In a well designed instrument and with ap- 
propriate treatment of the data such observations can 
nevertheless be comparatively long - particularly if the 
telescope orientation is ‘dithered’ or scanned during the 
observation to reduce the systematic noise, as forms part 
of the INTEGRAL observing strategy [2], as is impor- 
tant for SWIFT/BAT survey work [44], and is planned 
for EXIST [45,46]. 

2. Assumptions frequently made 

Simple analyses of the signal-to-noise ratio obtainable 
with a coded mask telescope often assume, explicitly or 
implicitly, that the following conditions are met: 

1. Half of the mask elements are open, half closed
(mask element open fraction $f_e = \tfrac{1}{2}$).

2. The holes are identical and equal in size to the pitch
of the grid on which they are placed. For example,
there is no supporting structure of the sort illustrated
in Figure 1 and the overall open fraction $f$ is
then equal to $f_e$.

3. The measurement uncertainty is the same for every
detector element. We here characterize the source
strength by $S$, the number of counts per unit area of
detector where the mask is open, and the background
by $B_f$, the number of background events
per unit area of detector. So this implies assuming
that $B_f \gg S$. The subscript $f$ is a reminder that
as the background generally contains a significant
contribution from diffuse sky emission, $B_f$ will generally
be a function of $f$ and of the solid angle of the
field of view. It is often convenient to assume that
other sources in the field of view give rise to flux
that is uncorrelated with the shadow pattern corresponding
to the source under consideration. Their
contribution to the detector counts can then be considered
to be smeared out and included in $B_f$. Any
such component of $B_f$, too, will depend on $f$.

4. Each mask element is either totally opaque ($t_0 = 0$)
or totally transparent ($t_1 = 1$). If this condition is
not met, and if there is a background component
due to the sky or other sources, then $B_f$ will also
depend on the actual values of $t_0$, $t_1$.

5. The number of events is such that Poisson statistics 
may be treated as Gaussian. It will be shown in §5 
that this supposition is a good one except in the 
most extreme circumstances, so that even where it 
is formally still being made below, it may often be 
ignored. 

6. The detector has perfect spatial resolution so that 
the exact position of arrival of each photon is known, 
as opposed to having finite size pixels or a realis- 
tic continuous position readout with some measure- 
ment uncertainty. 

The signal-to-noise ratio most often evaluated is the
estimate of the intensity of the flux from a source, relative
to the uncertainty in its measurement ($S/\sigma_S$ in the
terminology used below). This can be different from the
value relative to the noise in the absence of the source
($S/\sigma_I$ below). We will generally consider the former parameter
because the latter can readily be obtained from
the same formulae, but where relevant we will note as
an assumption that it is indeed $S/\sigma_S$ that is required:

7. The relevant signal-to-noise ratio is the estimate of 
the intensity of the flux from a source relative to the 
uncertainty in its measurement. This assumption is 
never needed if assumption 3 is made, as the two 
signal-to-noise ratio estimates are then the same. 

There are two more simplifying assumptions that we will 
generally continue to make: 

8. The sensitivity to be discussed is a typical value over 
part or all of the region imaged and/or the mask el- 
ements through which radiation is received are suf- 
ficiently numerous that numbers based on the aver- 
age open fraction may be used. Results are then the 
same for (M)URA-based masks and random ones. 
9. The measurement uncertainty due to background
counting statistics is uniform across the detector 
plane. Sometimes a highly non-uniform background 
may be modelled and subtracted out. Even if the ex- 
pectation level of the residual background is then ev- 
erywhere zero, the random fluctuations can be more 
important in some regions than others, in which case 
this assumption is not valid. 

An example of when assumption 8 is important arises 
when the detector plane consists of pixels that are on 
the same pitch as the mask elements or one that is a 
submultiple of it. If one considers only source directions 
such that the detector pixels are either fully shadowed 
or fully illuminated, then the sensitivity is the same as 
if the detector had perfect spatial resolution. However 
in other directions the sensitivity is up to a factor of 
2 poorer. It is generally most useful to average out such 
effects. An exception arises if an observation, or sequence 
of observations, is planned such that, for a particular 
source selected for study, the shadow boundaries always 
fall between detector pixels (e.g. the “7 point hexagonal” 
pointings of the INTEGRAL/SPI instrument [47]). 

Some aspects of a real system may invalidate sev- 
eral of these assumptions. For example at high energies 
masks are likely to be partially transparent, contrary 
to assumption 4, and the large thicknesses which are 
employed to minimize the consequent loss in sensitiv- 
ity mean that the apparent hole size and shape becomes 
a function of off-axis angle and one no longer has the 
simple situation assumed in 2. 

3. Relaxing conditions 1,2 and 3 - allowing 
masks with arbitrary pattern and detector 
background not necessarily dominant 

We will consider first a coded mask telescope in which 
the mask pattern is not necessarily 50% open, 50% 
closed (breaking assumption 1). Furthermore we suppose 
that the elements are not necessarily simple squares or 
hexagons and the mask pattern may contain structures 
other than the elements themselves (breaking assump- 
tion 2), an important example being the presence of a 
supporting grid as shown in Figure 1. At the same time 
we will consider the general case where it is not necessarily
true that $S \ll B_f$, breaking assumption 3. This case, and
the associated issue of the optimum open fraction of the
mask when $S$ is not small compared with $B_f$, have been
discussed in the literature [10, 13, 28, 48-51], but we here
try to provide a unified approach and to correct some errors
that have arisen.

A. Signal-to-noise ratio 

Fig. 1. A part of a mask in which the opaque elements are
supported by a grid with bar width $g$ and pitch $p$, leaving
holes of width $m$. The plot beneath shows the response
of a square detector pixel of side $d$ as it is moved across
the mask shadow along the line shown. The widths of
the transition regions are indicated in the case where
$g < d < m$.

The important parameter in this case is the 'open fraction'
of the mask, $f$. This takes into account the fraction
of mask elements that are open, $f_e$, but may also be affected
by other aspects of the design. Thus $f = f_e (m/p)^2$
in the case of the example structure in Figure 1. As for
the moment we still ignore any effects of finite detector
resolution, the total detector area $A$ may be considered
to be divided into an area $fA$ that sees the source plus
detector background (cosmic and particle) and an area
$(1 - f)A$ that just measures background. If the source is
in the PCFOV then $A$ should be taken as the area of that
part of the detector that would, but for the mask, see the
source. The expectation values for the counts measured
in the two regions are

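$$C_S = fA(S + B_f) \qquad (1)$$

$$C_B = (1 - f)A B_f \qquad (2)$$

Our estimate of the source strength is then

$$\hat{S} = \frac{C_S}{fA} - \frac{C_B}{(1-f)A} \qquad (3)$$

with variance

$$\sigma_S^2 = \frac{C_S}{(fA)^2} + \frac{C_B}{[(1-f)A]^2} \qquad (4)$$

$$= \frac{S + B_f}{fA} + \frac{B_f}{(1-f)A}. \qquad (5)$$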


The signal-to-noise ratio of the source flux measurement
is thus

$$\frac{S}{\sigma_S} = S\sqrt{\frac{f(1-f)A}{(1-f)S + B_f}} \qquad (6)$$

(assumptions 4-9).
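
As an illustrative check, the short Python sketch below evaluates Equation 6 directly and also rebuilds the same quantity step by step from the expected counts of Equations 1 and 2 and the variance of Equations 4 and 5. The function names and the numerical values are hypothetical, chosen only for illustration; they are not taken from any instrument.

```python
import math

def snr_measurement(S, B_f, f, A):
    """Equation 6: S / sigma_S for open fraction f and detector area A."""
    return S * math.sqrt(f * (1.0 - f) * A / ((1.0 - f) * S + B_f))

def snr_from_counts(S, B_f, f, A):
    """Same quantity rebuilt from expected counts (Eqs. 1-2) and variance (Eq. 4)."""
    C_S = f * A * (S + B_f)          # counts in the area exposed to the source
    C_B = (1.0 - f) * A * B_f        # counts in the shadowed area
    variance = C_S / (f * A) ** 2 + C_B / ((1.0 - f) * A) ** 2
    return S / math.sqrt(variance)

# Illustrative (assumed) values: counts per unit area and detector area.
S, B_f, f, A = 0.2, 1.0, 0.4, 1000.0
print(snr_measurement(S, B_f, f, A), snr_from_counts(S, B_f, f, A))
```

The two routes agree to rounding error, which is a useful sanity check when the expressions are adapted to other count models.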
Note that in this case we obtain the same result
whether $f$ deviates from $\tfrac{1}{2}$ because of a supporting grid
as in Figure 1, or whether it simply reflects the fraction
of mask elements which are open ($f_e$) or a combination
of the two. Indeed it applies to an arbitrary mask design.

Some particular cases are (a) the limiting case $B_f \gg S$,
for which

$$\frac{S}{\sigma_S} = S\sqrt{\frac{f(1-f)A}{B_f}} \qquad (7)$$

(assumptions 3-9)

and (b) the special case $f = \tfrac{1}{2}$, for which the signal-to-noise
ratio can be written

$$\frac{S}{\sigma_S} = \frac{(S/2)A}{\sqrt{[(S/2) + B_{1/2}]A}} = \left(\frac{S}{\sigma_S}\right)_{ref} \qquad (8)$$

(assumptions 1, 4-9),
which is simply the number of counts due to the source 
divided by the square root of all the counts (source plus 
background, the latter including any contribution from 
other sources in the field of view). Below we will use 
this value as a reference against which to compare the 
sensitivity in other cases. 

Finally (c) when both of these conditions apply, we
have the widely quoted expression for the signal-to-noise
ratio for an ideal 50% open coded mask instrument in
the background dominated case

$$\frac{S}{\sigma_S} = \frac{S}{2}\sqrt{\frac{A}{B_{1/2}}} \qquad (9)$$

(assumptions 1, 3-9).


The signal-to-noise ratio as defined above is the ratio
of the source strength to the uncertainty in its measurement.
For knowing whether a source is significantly detected
or not, a more appropriate measure is the ratio of the
measured flux to the noise in the surrounding region of
an image. Consider a test position away from the true
source position. The expected distribution of events for
a hypothetical source at this position should ideally be
uncorrelated with the actual distribution due to the real
source. If there is some residual correlation, then systematic
effects (ghosts or sidelobes) will result. However we
are here concerned with random noise so we may suppose
that all the recorded events will be divided between the
region measuring the flux from the hypothetical source
and that measuring the background, in proportion with
the areas of the two regions

$$C'_S = fA(fS + B_f), \qquad (10)$$

$$C'_B = (1-f)A(fS + B_f). \qquad (11)$$

With these values Equation 3 gives an expectation value
of zero¹, with variance

$$\sigma_I^2 = \frac{fS + B_f}{f(1-f)A} \qquad (12)$$

leading to

$$\frac{S}{\sigma_I} = S\sqrt{\frac{f(1-f)A}{fS + B_f}} \qquad (13)$$

(assumptions 4-6, 8, 9),
which differs from Equation 6 only in the factor multiplying
$S$ and can be obtained from it by omitting that
term and including the source flux in the background.
The two are identical if $B_f \gg S$ or if $f = \tfrac{1}{2}$.
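
The relation between the two signal-to-noise measures can be illustrated with a minimal Python sketch. It assumes the forms of Equations 6 and 13 as described in the text (noise term $(1-f)S + B_f$ for the flux measurement, $fS + B_f$ for detection); the function names and input values are hypothetical.

```python
import math

def snr_measurement(S, B_f, f, A):
    """Equation 6: source strength over the uncertainty in its measurement."""
    return S * math.sqrt(f * (1 - f) * A / ((1 - f) * S + B_f))

def snr_detection(S, B_f, f, A):
    """Equation 13: source strength over the noise at a source-free position."""
    return S * math.sqrt(f * (1 - f) * A / (f * S + B_f))

# Assumed example: a source twice as strong as the background per unit area.
for f in (0.3, 0.5, 0.7):
    print(f, snr_measurement(2.0, 1.0, f, 500.0), snr_detection(2.0, 1.0, f, 500.0))
```

For $f < \tfrac{1}{2}$ the detection measure exceeds the measurement one, for $f > \tfrac{1}{2}$ the order reverses, and at $f = \tfrac{1}{2}$ the two coincide.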


B. Optimum choice of f

If $B_f \gg S$, and if $B_f$ is independent of $f$, then consideration
of Equation 7 shows that the optimum open fraction
is 50%. However in general the background may not dominate.
$B_f$ is the combination of a component intrinsic to
the detector plus one due to a combination of diffuse sky
emission and the smeared effect of sources, other than
the one of interest, in the field of view. Thus we may
write $B_f = B_{det} + fB_{sky}$. Putting $b = B_{sky}/B_{det}$ and
$s = S/B_{det}$ and solving for the optimum value of $f$ one
finds

$$f_{opt} = \frac{1}{1 + \sqrt{(1+b)/(1+s)}} \qquad (14)$$

(assumptions 4-9).

Equation 14 is equivalent to the expression given by in 
’t Zand et al. [51]. Although their result is correct, those 
authors state that it is the same as that of Fenimore [49], 
which in fact contains an error (as pointed out by Accorsi 
et al. [28]). 
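
The optimisation can be checked numerically. The sketch below maximises the $f$-dependent part of Equation 6, with $B_f = B_{det} + fB_{sky}$, by a brute-force scan and compares the result with the closed form $f_{opt} = 1/[1 + \sqrt{(1+b)/(1+s)}]$, which follows from setting the derivative to zero. The function names, grid size and test values are assumptions made for this illustration.

```python
import math

def snr_shape(f, s, b):
    """(S/sigma_S)^2 up to a constant factor, with B_f = B_det * (1 + f*b)."""
    return f * (1 - f) / ((1 - f) * s + 1 + f * b)

def f_opt_closed(s, b):
    """Closed-form optimum open fraction obtained from d(snr_shape)/df = 0."""
    return 1.0 / (1.0 + math.sqrt((1 + b) / (1 + s)))

def f_opt_scan(s, b, n=20001):
    """Brute-force optimum on a uniform grid of open fractions in (0, 1)."""
    grid = [i / (n - 1) for i in range(1, n - 1)]
    return max(grid, key=lambda f: snr_shape(f, s, b))

for s, b in [(0.0, 0.0), (0.0, 8.0), (3.0, 0.0), (0.5, 2.0)]:
    print(s, b, f_opt_closed(s, b), f_opt_scan(s, b))
```

A strong sky background (large $b$) pushes the optimum below one half, while a strong source (large $s$) pushes it above.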

We can measure the advantage, $g$, of using the optimum
open fraction as the signal-to-noise ratio with
$f = f_{opt}$, relative to that for $f = \tfrac{1}{2}$ given by Equation
8. After much manipulation it turns out that $g$ depends
only on the value of $f_{opt}$ and is independent of the particular
combination of $s$ and $b$ which led to that value.
It is simply given by

$$g(b,s)^2 = 1 + 4\left(f_{opt}(b,s) - \tfrac{1}{2}\right)^2. \qquad (15)$$
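
The statement that $g$ depends only on $f_{opt}$ can be verified with a few lines of Python. The sketch computes $g$ directly as a ratio of signal-to-noise values and compares it with $\sqrt{1 + 4(f_{opt} - \tfrac{1}{2})^2}$. It assumes $B_f = B_{det} + fB_{sky}$ and the closed-form optimum $f_{opt} = 1/[1 + \sqrt{(1+b)/(1+s)}]$; all names are hypothetical.

```python
import math

def snr_shape(f, s, b):
    """(S/sigma_S)^2 up to a constant factor, with B_f = B_det * (1 + f*b)."""
    return f * (1 - f) / ((1 - f) * s + 1 + f * b)

def f_opt(s, b):
    """Optimum open fraction (assumed closed form, see lead-in)."""
    return 1.0 / (1.0 + math.sqrt((1 + b) / (1 + s)))

def advantage_direct(s, b):
    """g as the ratio of S/N at f_opt to S/N at f = 1/2."""
    return math.sqrt(snr_shape(f_opt(s, b), s, b) / snr_shape(0.5, s, b))

def advantage_from_fopt(s, b):
    """g from the compact relation g^2 = 1 + 4 (f_opt - 1/2)^2."""
    return math.sqrt(1 + 4 * (f_opt(s, b) - 0.5) ** 2)

for s, b in [(0.0, 8.0), (10.0, 0.1), (1.0, 1.0)]:
    print(s, b, advantage_direct(s, b), advantage_from_fopt(s, b))
```

Even for $b = 8$, where the optimum open fraction is 0.25, the gain over a half-open mask is only about 12%, in line with the remark that the advantage is small except in extreme circumstances.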

Figure 2 provides a convenient nomogram for $f_{opt}$ and
$g$ which also illustrates some conclusions that can be
drawn. One sees, for example, how large $b$ (strong background
from the sky or from sources other than that of
interest) favors low $f$, moving towards the single pinhole
camera extreme. On the other hand, for studying a
bright source (large $s$) high $f$, more like an open 'light
bucket', is preferable.

¹ This illustrates an approximation in the approach used here.
The shadow of a source at trial positions away from the peak cannot
be totally independent of that expected for a source at the
peak position [52]. The process described here is equivalent to correlation
with a mean-subtracted form of the expected count distribution,
which must produce a function with a mean of zero.
Thus a positive peak implies a negative mean level elsewhere. The
effect goes inversely with the number of resolution elements in the
field of view and becomes negligible for sufficiently large N.

Fig. 2. Top: Nomogram for determining the optimum
mask open fraction $f_{opt}$ given the parameters $b$ and
$s$, which are respectively the sky background and the
strength of the observed source, relative to the intrinsic
detector background. $f_{opt}$ can be read from the horizontal
scale at the intersection of lines of constant $s$
(continuous lines, logarithmically spaced) and of constant
$b$ (dashed lines, also logarithmically spaced). The
arrows illustrate the effect of a finite detector resolution
(see §6C). Bottom: The signal-to-noise ratio when using
$f = f_{opt}$ relative to that with $f = \tfrac{1}{2}$.

As has been noted by other authors, the advantage
in signal-to-noise ratio to be obtained by using a value
of $f$ other than $\tfrac{1}{2}$ is small except in the most extreme
circumstances. We note however that the low values
of $f$ marginally favored from this point of view when
$S \ll B_{det} \ll B_{sky}$ can lead to important data handling
and telemetry reductions, particularly when information
about each event is recorded.

If the source to be studied dominates over the effects
of intrinsic background by more than does the combination
of all other sources and the diffuse sky emission
($s > b$), the optimum fraction can be larger than $\tfrac{1}{2}$. The
circumstances in which this is most likely to be relevant
are those in which the objective is to obtain information very
quickly, for example when studying short bursts of emission.
We note, however, that this conclusion depends on
the definition of signal-to-noise ratio.

If it is the detectability of a source that is important,
rather than the precision with which its intensity can be
measured, then it is $S/\sigma_I$ (Equation 13) that should be
optimized rather than $S/\sigma_S$ (Equation 6). The flux from
the source should then be included in $B_{sky}$, not $S$, and
Equations 14 and 15 and Figure 2 used with $s$ set to zero
(or a small value). Figure 2 shows that $f_{opt}$ is always less
than $\tfrac{1}{2}$ in this case.

4. Imperfect mask opacity/transparency: relaxing assumption 4

If the mask elements are not perfectly
opaque/transparent but have transmissions $t_0$, $t_1$,
Equations 1 and 2 take the form

$$C_S = fA(t_1 S + B_f) \qquad (16)$$

$$C_B = (1-f)A(t_0 S + B_f). \qquad (17)$$

Note that $B_f$ will in this case be a function of $t_0$, $t_1$, as
well as of $f$. Following the same logic as above one finds

$$\frac{S}{\sigma_S} = S(t_1 - t_0)\sqrt{\frac{f(1-f)A}{[(1-f)t_1 + f t_0]S + B_f}} \qquad (18)$$

(assumptions 5-9)

and for source detection

$$\frac{S}{\sigma_I} = S(t_1 - t_0)\sqrt{\frac{f(1-f)A}{[f t_1 + (1-f)t_0]S + B_f}} \qquad (19)$$

(assumptions 5, 6, 8, 9).

Thus the only changes necessary to allow for a uniform
absorption in the nominally open areas ($t_1 < 1$) and/or
for uniform leakage through the closed ones ($t_0 > 0$) are
to multiply the signal-to-noise ratio by a factor $t_1 - t_0$ and
to correct the noise contribution due to source counts if
this is not negligible. The reason for the simple form of
the multiplying factor will become evident in Section 6.

Accorsi et al. [28] have treated the question of the
optimum $f$ when the mask is leaky ($t_0 > 0$), but unfortunately
the expression they give contains some typographical
errors, as well as being rather complex (although
their equation for the signal-to-noise ratio is correct,
that for $f_{opt}$ twice has 2 1 in place of t). The general
case, $t_0 > 0$ and $t_1 < 1$, can, however, be handled by
using the equations and nomogram of §3B with adjusted
parameters

$$s' = t_1 s + t_0 b$$

$$b' = t_0 s + t_1 b \qquad (20)$$

in place of $s$, $b$.
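
A brief Python sketch shows why the substitution works. It assumes, for illustration only, that the sky background reaches the detector with the mean mask transmission, so that $B_f = B_{det} + [f t_1 + (1-f) t_0] B_{sky}$; the source text states only that $B_f$ depends on $t_0$, $t_1$ and $f$. Under that assumption the noise term for a leaky mask, in units of $B_{det}$, equals the ideal-mask noise term evaluated with $s' = t_1 s + t_0 b$ and $b' = t_0 s + t_1 b$. Function names are hypothetical.

```python
def noise_term_leaky(f, s, b, t0, t1):
    """Noise term of the flux-measurement S/N for a leaky mask, in units of B_det.

    Assumes the sky background is attenuated by the mean mask transmission.
    """
    source_part = ((1 - f) * t1 + f * t0) * s
    sky_part = (f * t1 + (1 - f) * t0) * b
    return source_part + 1.0 + sky_part

def adjusted_parameters(s, b, t0, t1):
    """Adjusted source and sky parameters s', b'."""
    return t1 * s + t0 * b, t0 * s + t1 * b

def noise_term_ideal(f, s, b):
    """Noise term for an ideal mask: (1 - f) s + 1 + f b."""
    return (1 - f) * s + 1.0 + f * b

f, s, b, t0, t1 = 0.35, 0.8, 3.0, 0.1, 0.9   # assumed illustrative values
s_adj, b_adj = adjusted_parameters(s, b, t0, t1)
print(noise_term_leaky(f, s, b, t0, t1), noise_term_ideal(f, s_adj, b_adj))
```

Because the two noise terms agree for every $f$, the optimum open fraction for the leaky mask is the ideal-mask optimum evaluated at the adjusted parameters.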

5. Gaussian or Poissonian statistics?: Assumption 5

Fig. 3. The numbers of events in the region of
the detector plane which is exposed to the source ($C_S$)
and in that shadowed by the mask ($C_B$) needed to achieve
a given level of detection significance. The dashed and
continuous curves are calculated using the Cash statistic
(Equation 21), which correctly handles Poisson statistics,
and are for $f = 0.4$ and $f = 0.6$ respectively. Dotted
curves are based on $\delta\chi^2$ (Equation 22) and so use the
approximation of Gaussian statistics. The curves for each
family are at levels of 17.3, 26.3, 37.4 and 50.4 (left to
right), which for both the Cash statistic and $\chi^2$ correspond
to $4\sigma$, $5\sigma$, $6\sigma$ and $7\sigma$ respectively for one degree of
freedom (appropriate if the source position is known).

In the above, the only respect in which it has been assumed
that Gaussian statistics are applicable is in characterizing
the signal-to-noise ratio and the significance
of detection in terms of a standard deviation calculated
as the root of the sum of the variances (Equation 4, or its
equivalents in other cases). If both $C_S$ and $C_B$ are small
this is a slight simplification as the distribution of the estimate of $S$ will
not be strictly Gaussian (or Poissonian). The resulting
effects are tiny except where only a very few events in 
total are involved. A potentially relevant case arises in 
the detection of a very brief burst - one occurring in 
so short a time that the background is very small. In 
this case the significance of detection should ideally be 
expressed in terms of likelihood. Often in these circum- 
stances the background rate (and even its distribution 
over the detector) will be well determined by consider- 
ing data before, and perhaps after, the burst. However 
to place the problem in the same context as the above 
discussion we consider the case where there is no infor- 
mation available outside the time of the event itself. For 
the same reason we will consider the significance of detection
at a given position, without considering the degrees
of freedom associated with finding the location of
the event.

If we observe counts $C_S$ and $C_B$ in the exposed and
shadowed parts of the detector², the difference in the
Cash likelihood statistic [53] between the null (background
only) hypothesis and the hypothesis in which a
source is present at the supposed position is found to be

$$\delta C = 2\left[C_S \ln\frac{C_S}{f} + C_B \ln\frac{C_B}{1-f} - (C_S + C_B)\ln(C_S + C_B)\right] \qquad (21)$$

(assumptions 4, 6, 8, 9).

² The same symbols are used indiscriminately here for the observed
and expected numbers of events as the one is the best estimate
of the other.
Figure 3 shows, for two examples of /, the combinations 
of numbers of events which give particular levels of con- 
fidence in the detection of a source, calculated according 
to Equation 21. For comparison the corresponding contours
can be calculated on the Gaussian assumption by
evaluating the $\chi^2$ parameter

$$\delta\chi^2 = \frac{(C_S - f(C_S + C_B))^2}{f(C_S + C_B)} + \frac{(C_B - (1-f)(C_S + C_B))^2}{(1-f)(C_S + C_B)}$$

$$= \frac{(C_S - f(C_S + C_B))^2}{f(1-f)(C_S + C_B)} \qquad (22)$$

(assumptions 4-6, 8, 9).
In fact $\delta\chi^2$ in Equation 22 is just the square of $S/\sigma_I$ from
Equation 13.

Contours of constant $\chi^2$, calculated according to
Equation 22, are also shown in Figure 3. As the two statistics
both follow the $\chi^2(1)$ distribution, a direct comparison
can be made. It can be seen that there is relatively
little difference between the two except in extreme cases
where the total number of events in the background region
is quite low. The Gaussian assumption
can still be good even if the number of events per
detector element is small (even $\ll 1$), provided the total
number of background events, $C_B$, is more than a
dozen or so. For this reason assumption 5 is almost always
valid.
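
The comparison between the two statistics can be reproduced with a short Python sketch. The Cash difference is written here as the standard two-region Poisson likelihood ratio, in which the null hypothesis shares the observed total between the exposed and shadowed regions in the ratio $f : (1-f)$; that form, the function names and the example counts are assumptions made for illustration. Both counts must be positive.

```python
import math

def delta_cash(C_S, C_B, f):
    """Two-region Poisson likelihood-ratio statistic (assumed form of Eq. 21)."""
    n = C_S + C_B
    return 2.0 * (C_S * math.log(C_S / (f * n))
                  + C_B * math.log(C_B / ((1 - f) * n)))

def delta_chi2(C_S, C_B, f):
    """Gaussian approximation, Equation 22."""
    n = C_S + C_B
    return (C_S - f * n) ** 2 / (f * (1 - f) * n)

def snr_detection_from_counts(C_S, C_B, f, A=1.0):
    """S/sigma_I from Eqs. 3 and 12, with (f S + B_f) estimated as n / A."""
    n = C_S + C_B
    s_hat = C_S / (f * A) - C_B / ((1 - f) * A)
    sigma_I = math.sqrt(n / (f * (1 - f))) / A
    return s_hat / sigma_I

# Assumed example with many events: the two statistics nearly coincide.
print(delta_cash(4300, 5700, 0.4), delta_chi2(4300, 5700, 0.4))
# Assumed example with very few events.
print(delta_cash(6, 1, 0.4), delta_chi2(6, 1, 0.4))
```

The last helper makes explicit that $\delta\chi^2$ is the square of $S/\sigma_I$ when the expected counts are replaced by the observed ones.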


6. Finite detector resolution — Assumption 6 

There remains the assumption that the detector has per- 
fect spatial resolution. Unfortunately relaxing this as- 
sumption has a major impact. The situation is simplest if 
we again assume background dominated conditions (as- 
sumption 3). We consider this case first. 


A. Background dominated case 


The case where the mask shadow is recorded by a detector
with limited spatial resolution and where $B_f \gg S$
is considered by Skinner [8]. It is shown there that in
this case the sensitivity relative to that of a reference
system with $f = \tfrac{1}{2}$ (Equation 8) is given by the "coding
power", $\Delta$, so that

$$\frac{S}{\sigma_S} = \Delta \left(\frac{S}{\sigma_S}\right)_{ref} \qquad (23)$$

where

$$\Delta^2 = 4\left[\frac{1}{N}\sum_i P_i^2 - \left(\frac{1}{N}\sum_i P_i\right)^2\right] \qquad (24)$$

(assumptions 3, 5, 8, 9)
and where $P_i$ is the response in detector element $i$ to
a source at the position under consideration, relative to
that for a fully exposed element to the same source (so
that $0 \le P_i \le 1$), and $N$ is the number of detector elements.
Assumption 4 is not needed as $t_0$, $t_1$
can be taken into account in calculating the $P_i$. Equation
24 shows that $\Delta$ is simply twice the rms deviation of the $P_i$ about their mean.
It can be quite different even for two close directions as
the shadows of the edges of mask elements may fall differently
with respect to the detector elements. If there
is a large number of detector elements and their pitch
is not commensurate with that of the mask elements
then all relative phases of the two arrays will occur with
about the same frequency and any such variations will
be small. If there is a simple ratio between the pitches,
then an average value of $\Delta$ may be used as a measure of
the mean sensitivity, averaged over different sky directions
(different shifts of the mask shadow), i.e. we invoke
assumption 8.
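
As a minimal sketch of how the coding power might be evaluated for a given set of pixel responses, the Python function below takes the coding power to be twice the rms deviation of the $P_i$ about their mean, which is the reading consistent with the factor $\sqrt{4f(1-f)}$ for an ideal mask. The function name and the example response patterns are hypothetical.

```python
import math

def coding_power(P):
    """Twice the rms deviation of the pixel responses P_i about their mean."""
    n = len(P)
    mean = sum(P) / n
    mean_sq = sum(p * p for p in P) / n
    # Guard against a tiny negative value caused by rounding.
    return 2.0 * math.sqrt(max(0.0, mean_sq - mean * mean))

ideal_half_open = [1.0] * 50 + [0.0] * 50      # perfectly resolved, f = 0.5
ideal_40_percent = [1.0] * 40 + [0.0] * 60     # perfectly resolved, f = 0.4
fully_blurred = [0.5] * 100                    # shadow completely unresolved

print(coding_power(ideal_half_open))
print(coding_power(ideal_40_percent), math.sqrt(4 * 0.4 * 0.6))
print(coding_power(fully_blurred))
```

Intermediate ('gray') responses reduce the spread of the $P_i$ and hence the coding power, which is how finite detector resolution enters.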

The concept of coding power is a very useful one. It
can be used to derive Equation 7, which is the limiting
form of Equation 6 when $B_f \gg S$, or the corresponding
form of Equations 18 (or 19) in the same limit. But it
also allows quantitative treatment of the loss in sensitivity
due to finite detector resolution in any background
limited case. A detector with limited spatial resolution
records only a blurred version of the shadow of the mask,
as illustrated in Figure 1, and the rms deviation of $P$ is
consequently reduced. This approach was used in [8] to
obtain an expression for the sensitivity of a telescope
having square mask elements of side $m$ and 50% open fraction
when the detector has square pixels of finite size $d$. Generalizing
the result obtained there to allow for masks
with any open fraction and with imperfect transmission
and opacity, one finds that the sensitivity must be multiplied
by a coding power factor

$$\Delta = \left(1 - \frac{d}{3m}\right)(t_1 - t_0)\sqrt{4f(1-f)} \quad \text{if } m > d$$

$$\Delta = \frac{m}{d}\left(1 - \frac{m}{3d}\right)(t_1 - t_0)\sqrt{4f(1-f)} \quad \text{if } m < d \qquad (25)$$

(assumptions 3, 5, 8, 9).
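
The piecewise expression is easy to tabulate. The Python sketch below implements it as read here: a blurring factor $1 - d/(3m)$ for $m > d$ or $(m/d)[1 - m/(3d)]$ for $m < d$, multiplied by $(t_1 - t_0)\sqrt{4f(1-f)}$. The function name, default arguments and sample values are assumptions for illustration.

```python
import math

def coding_power_pixels(m, d, f=0.5, t0=0.0, t1=1.0):
    """Coding power for square mask elements of side m and square pixels of side d."""
    if m >= d:
        blur = 1.0 - d / (3.0 * m)
    else:
        blur = (m / d) * (1.0 - m / (3.0 * d))
    return blur * (t1 - t0) * math.sqrt(4.0 * f * (1.0 - f))

# Sensitivity relative to a perfectly resolving detector, for a half-open mask.
for ratio in (0.1, 0.5, 1.0, 2.0):
    print(ratio, coding_power_pixels(m=1.0, d=ratio))
```

The two branches join continuously at $m = d$, where the blurring factor is $2/3$, and the factor tends to unity as the pixels become much smaller than the mask elements.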

Analytic solutions in even more general cases are
messy and not very revealing, but numerical calculation
of $\Delta$ allows the sensitivity of a proposed or actual system
to be estimated. As an example we take the case shown
in Figure 1, which is relevant to the EXIST project. The
shadow of a mask with square elements of side $m$, supported
by a grid structure with bar width $g$, is imagined
to be recorded using a detector having square pixels of
side $d$. The variation of $P$ along one particular line is
illustrated in the case $g < d < m$.

The calculation can be simplified by noting that the
mask pattern can be described as the convolution of a
single mask hole, side $m$, with a sparse 'bed of nails' (2-d
Shah [54]) function, pitch $p = m + g$, in which only a
fraction $f_e$ of the spikes are present. The response function
is then obtained by a further convolution with the
form of a detector element. Use of the convolution theorem
allows the Fourier transform of the response to
be obtained and Parseval's theorem then gives its mean
square value.

Fig. 4. The loss in sensitivity if the detector spatial resolution
is not perfect. The continuous line (a) is for a
simple mask with square elements having open fraction
$f = 0.5$ and a detector with square pixels (Equation 24).
The dotted line (b) shows the corresponding curve with
$f = 0.4$. The dashed line (c) is for a mask in which 50% of
the elements are open but in which a supporting grid like
that in Fig. 1 reduces the transparency to $f = 0.4$. If the
detector resolution is good (low $d/p$) the grid provides
coding and so the curve approaches (b). If the detector
resolution is too poor, the grid simply attenuates the flux
and reduces the sensitivity. (Assumptions 3-5, 8, 9).

Example results are shown in Figure 4. When the sup- 
porting grid is present there is an important loss in sen- 
sitivity unless the detector pixels are very small. In effect 
the loss is due to an increased fraction of intermediate, 
‘gray’, levels because the shadow of the fine grid is poorly 
resolved. 

In real conditions the shadow of the mask cast by an 
off-axis source may not simply be a translation of that 
for an on-axis source. The finite thickness of the mask 
elements and/or that of a supporting grid, or the par- 
tial transparency of the structure may modify the off- 
axis response. Grindlay and Hong [45] have discussed 
approaches to some of the problems associated with such 
complications. 

B. Finite detector resolution - Background not dominant 

The approach used in [8] and section 6A for deriving the 
expression for the sensitivity considers the problem as 
equivalent to finding the gradient of the best-fit straight 
line in a data space relating the observed counts in a 
detector pixel, $C_i$, to the corresponding $P_i$. In the case 
treated there, $B_t \gg S$, so the errors on each point are 
the same. In the general case the number of counts expected 
in detector pixel i, of surface area a, is³ 

$$C_i = a(B_t + P_i S) \pm \sqrt{C_i} \qquad (26)$$


and the best estimate of S can be found as the gradient 
of the straight line which is a weighted best fit to the 
points ($C_i$, $P_i$). The uncertainty in the gradient is given 
by 



where $A_w$ is simply the weighted equivalent of A: 


$$A_w = \frac{\sum_i P_i^2/C_i}{\sum_i 1/C_i} - \left[\frac{\sum_i P_i/C_i}{\sum_i 1/C_i}\right]^2$$


(assumptions 5, 7-9). 
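
A weighted straight-line fit of this kind can be sketched in a few lines of Python. The sketch assumes the textbook weighted least-squares result, in which the variance of the fitted gradient is the reciprocal of the sum of the weights times the weighted variance of the abscissa, so that $\sigma_S = 1/(a\sqrt{A_w \sum_i 1/C_i})$. It also assumes that the Poisson variance of each pixel is approximated by its observed count. The function name and the choice of weights are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def weighted_flux_estimate(C, P, a):
    """Weighted straight-line fit of counts C_i against P_i (illustrative).

    Assumes C_i = a*(B + P_i*S) with the variance of each point taken as
    the observed C_i, so the weights are 1/C_i (requires C_i > 0).
    Returns (S_hat, sigma_S, A_w).
    """
    C = np.asarray(C, dtype=float)
    P = np.asarray(P, dtype=float)
    w = 1.0 / C                                   # inverse-variance weights
    W = w.sum()
    P_bar = (w * P).sum() / W                     # weighted mean of P
    C_bar = (w * C).sum() / W                     # weighted mean of C
    A_w = (w * P**2).sum() / W - P_bar**2         # weighted variance of P
    cov = (w * P * C).sum() / W - P_bar * C_bar   # weighted covariance
    S_hat = cov / (a * A_w)                       # gradient of C vs P is a*S
    sigma_S = 1.0 / (a * np.sqrt(W * A_w))
    return S_hat, sigma_S, A_w

# Noise-free check with hypothetical values a = 2, B = 10, S = 3.
P_demo = np.linspace(0.0, 1.0, 11)
C_demo = 2.0 * (10.0 + 3.0 * P_demo)
print(weighted_flux_estimate(C_demo, P_demo, a=2.0))
```

Because the weights depend on the counts themselves, in practice one might iterate, replacing the observed counts by the fitted model when computing the weights.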


C. Optimum open fraction with finite detector resolution 

In 't Zand et al. [51] have noted how imperfect detector 
spatial resolution tends to decrease the optimum open 
fraction. For the case they discussed, that of the BeppoSAX 
wide field cameras, they concluded that f in the 
range 0.25-0.33 was the best choice. However, this conclusion 
owed more to the fact that the simulated fields 
contained several sources whose flux led to a high b than 
to the effects of the detector resolution. 

The shift of $f_{opt}$ to lower values due to imperfect detector 
resolution can nevertheless be important when 
studying a single strong source. The arrows in Figure 
2 illustrate the effect with m/d = 2 (shorter arrows) and 
with d = m (a case discussed below in §7). 

D. Poisson Statistics and finite detector resolution 

Finally, if the detector resolution is finite and the 
number of events is so small that Poisson statistics 
must be used, the source flux can be obtained by optimising 
the Cash likelihood statistic 

$$C = 2\sum_i \left[(P_i S + B_t) - C_i \ln(P_i S + B_t)\right] \qquad (28)$$

and the confidence limits obtained by finding the S for 
which C changes by the required amount (with $B_t$ refitted). 
For calculating the confidence with which the null 
(background only) hypothesis can be rejected one can 
calculate 

$$\Delta C = -2\sum_i C_i \ln(P_i S + B_t) + 2n\bar{C}\ln(\bar{C}), \qquad (29)$$

(only assumptions 7-9 necessary) 
n being the number of detector pixels and $\bar{C}$ the mean number of counts per pixel. Thus although 
no useful explicit formulae are available in this case, use 
of this statistic provides a method for dealing with any 
particular example. 
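
As an illustration of such a procedure, the sketch below minimises the Cash statistic by brute force over a grid of trial source fluxes and backgrounds, then compares the result with the best background-only (constant) model. It assumes that S and the background are expressed in counts per pixel, so that the pixel area is absorbed. The statistic is reported as the drop from the null model to the best fit, a sign convention chosen here for convenience. The function names, the grid search and the simulated data are all hypothetical; a real analysis would normally use a proper optimiser.

```python
import numpy as np

def cash(C, mu):
    """Cash statistic 2*sum(mu - C*ln(mu)); the model mu must be positive."""
    return 2.0 * np.sum(mu - C * np.log(mu))

def fit_flux(C, P, S_grid, B_grid):
    """Brute-force minimum of the Cash statistic over a grid of (S, B)."""
    best_stat, best_S, best_B = np.inf, None, None
    for S in S_grid:
        for B in B_grid:
            stat = cash(C, P * S + B)
            if stat < best_stat:
                best_stat, best_S, best_B = stat, S, B
    return best_stat, best_S, best_B

def null_rejection_statistic(C, P, S_grid, B_grid):
    """Drop in the Cash statistic relative to the best constant model."""
    best_stat, S_hat, B_hat = fit_flux(C, P, S_grid, B_grid)
    null_stat = cash(C, np.full(len(C), np.mean(C)))  # background only
    return null_stat - best_stat, S_hat, B_hat

# Simulated example: 2000 pixels with intermediate ('gray') values of P.
rng = np.random.default_rng(1)
P = rng.random(2000)
C = rng.poisson(5.0 * P + 2.0)          # true S = 5, true background = 2
S_grid = np.arange(0.0, 10.01, 0.1)
B_grid = np.arange(0.5, 4.01, 0.1)
delta, S_hat, B_hat = null_rejection_statistic(C, P, S_grid, B_grid)
print("S_hat =", S_hat, "B_hat =", B_hat, "delta =", delta)
```

Confidence limits on S could be obtained in the same spirit by stepping S away from the best-fit value, refitting the background at each step, and noting where the statistic has risen by the required amount.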

³ Note that the $C_i$ are counts per pixel, whereas the Cd were 
totals for all the pixels of a particular category. 



Fig. 5. Variation of relative values of signal-to-noise ratio, 
angular resolution, and source location accuracy with 
mask element size for a fixed detector element size. Assumptions 
3, 5, 8, 9 are made; in particular, noise is assumed 
to be background dominated ($B_t \gg S$). 


7. Optimizing source position determination accuracy 

For a given mask to detector separation, dictated 
perhaps by spacecraft accommodation considerations 
and/or a minimum required field of view, the angular res- 
olution of a coded mask telescope depends on the mask 
pixel size and also on that of the detector. Practical is- 
sues usually limit how small the pixels of a detector may 
be made, while the mask design is usually less subject to 
constraints. If we suppose the detector pixel size to be 
fixed, the angular resolution will formally continue to im- 
prove as the mask pixels are made smaller and smaller, 
but as was seen above, with low m/d the significance 
with which sources are detected will suffer. 

Simulations confirm that a good approximation to the 
angular resolution is obtained by taking the Pythagorean 
sum of the angle subtended by a mask pixel at the 
detector and that subtended by a detector pixel at the 
mask. Near the centre of the field of view 

$$\delta\theta^2 = (m/l)^2 + (d/l)^2 \qquad (30)$$

where l is the mask-detector separation. Note that in a 
case such as that illustrated in Figure 1 it is the size of 
the holes (m) that is important, not their pitch. 

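The accuracy with which a source can be located is 
better than this by a factor approximately proportional 
to the signal-to-noise ratio $S/\sigma_S$ of the source⁴ so 

$$\sigma_\theta = \left(\frac{\sigma_S}{S}\right) k \left[(m/l)^2 + (d/l)^2\right]^{1/2}$$

$$\phantom{\sigma_\theta} = \left(\frac{A_{d=0}}{A}\right)^{1/2} \left(\frac{\sigma_S}{S}\right)_{d=0} k \left[(m/l)^2 + (d/l)^2\right]^{1/2} \qquad (31)$$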

⁴ Sometimes $(S/\sigma_S - 1)$ is used for a better approximation but 
the differences are small in practical cases. 
where k is a constant of the order of unity which depends 
on the exact definition of location accuracy. Substituting 
A from Equation 25 one finds that, on the assumptions 
under which these equations apply (3, 5, 8, 9), the lowest 
position uncertainty is obtained with m = d. This is 
illustrated in Figure 5. 
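
The location of this minimum can be explored with a short numerical sketch. Equation 25 is not reproduced in this section, so the sketch assumes a relative signal-to-noise factor of 1 - d/(3m) for m >= d and (m/d)(1 - m/(3d)) for m < d. This is what a random mask of square elements blurred by square pixels gives in the background-dominated case, and it agrees with the factor 2/3 quoted in the conclusions for m = d. The function names and the grid of trial values are illustrative assumptions.

```python
import numpy as np

def snr_factor(x):
    """Assumed relative S/N versus x = m/d (random mask of square elements,
    square pixels, background-dominated noise)."""
    x = np.asarray(x, dtype=float)
    return np.where(x >= 1.0, 1.0 - 1.0 / (3.0 * x), x * (1.0 - x / 3.0))

def relative_location_error(x):
    """Angular resolution (Pythagorean sum, in units of d/l) divided by S/N."""
    x = np.asarray(x, dtype=float)
    return np.sqrt(x**2 + 1.0) / snr_factor(x)

x = np.linspace(0.2, 4.0, 3801)          # trial values of m/d, step 0.001
best = x[np.argmin(relative_location_error(x))]
print("m/d minimising the location error:", best)
```

Evaluating `relative_location_error` at neighbouring values of m/d also shows how shallow the minimum is, which is relevant to the trade-off between location accuracy and the detection of faint sources.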

In the design of an instrument the number of objects 
detectable is likely to also be a consideration. As the 
minimum is relatively shallow, choosing a slightly higher 
m/d will allow additional faint sources to be detected at 
the expense of only a small loss in positioning accuracy 
for brighter ones. The BAT instrument on SWIFT uses 
m/d = 5/4; a value of m/d = 2 is baselined for EXIST. 

8. Conclusions 

The formulae presented above offer insight into the way 
in which the inevitable uncertainties due to Poisson 
statistics affect measurements with coded mask telescopes. 
In some cases the differences between a simplified 
treatment and the more precise one can be quite 
large. For example with the choice of m/d = 1, shown in 
§7 to give the best source location accuracy, the 
sensitivity is worse by a factor of 2/3 than that which would 
be expected by blind application of a simplified formula. 
Often in astronomy the number of objects observed depends 
on the −3/2 power of the detection threshold, so 
use of the simplified approach would lead to an overestimate 
of that number by a factor of 1.8. 
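
The arithmetic behind this factor can be written out explicitly. The symbols $N_{\rm simplified}$ and $N_{\rm true}$ are introduced here only for illustration: they denote the numbers of objects predicted with the simplified and the more precise sensitivity respectively, assuming that the number of objects scales as the −3/2 power of the detection threshold.

$$\frac{N_{\rm simplified}}{N_{\rm true}} = \left(\frac{2}{3}\right)^{-3/2} = \left(\frac{3}{2}\right)^{3/2} \approx 1.84$$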

Although the discussion here has been in terms of 
measuring the flux from a particular direction, for exam- 
ple that from a point source, in many cases the results 
can be applied to extended sources by considering the 
flux per angular resolution element from the source. 

It should be noted that systematic errors due to (uncorrected) 
variations in the background level across the detector 
plane have not been considered, nor have 
those due to ‘ghosts’ (sidelobes) of other sources. Thus 
the results are most directly applicable in the case 
of short observations of relatively bright sources (e.g. 
gamma-ray bursts) or where the design or observation 
strategy is such as to minimize such errors (e.g. through 
use of scanning, combined with a very large number of 
detector pixels, as in the proposed EXIST black hole 
finder mission). 

The author wishes to thank Roberto Accorsi, David 
Band, Jean in ’t Zand and Craig Markwardt for helpful 
discussions. 